AP® STATISTICS 
2006 SCORING GUIDELINES 


Question 6 


Intent of Question 


The primary goals of this question are to evaluate a student’s ability to apply the concepts of significance testing 
to a new setting, in particular to: (1) state hypotheses for a parameter of interest, given a research question; (2) 
evaluate a new test statistic and use the probability distribution associated with that statistic to test the hypotheses 
of interest; (3) identify the values of the test statistic that would lead to rejection of the null hypothesis on a 
graph; and (4) interpret simulated sampling distributions for different populations. 


Solution 


Part (a): 


Let $\sigma^2$ denote the variance in the temperatures measured by the thermostats recently produced by this 
manufacturer. 


$H_0: \sigma^2 = 1.52\ (^\circ\mathrm{F})^2$ OR Recently produced thermostats are not more variable than thermostats produced 
in the past. 


$H_a: \sigma^2 > 1.52\ (^\circ\mathrm{F})^2$ OR Recently produced thermostats are more variable than thermostats produced in 
the past. 
Part (b): 
$\frac{(n-1)s^2}{1.52} = \frac{9 \times (1.4277)^2}{1.52} = \frac{9 \times (2.0383)}{1.52} = 12.069$
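The arithmetic in part (b) can be checked with a short script. This is only a sketch: the variable names are illustrative and not part of the scoring guidelines, and the readings are taken from the table in the question. Because it carries full precision rather than the rounded s = 1.4277, the result may differ from 12.069 in the last digit.

```python
from statistics import variance

# Thermostat readings (degrees F) from the table in the question.
readings = [66.8, 67.8, 70.6, 69.3, 65.9, 66.2, 68.1, 68.6, 67.9, 67.2]
sigma0_sq = 1.52  # historical variance under H0, in (degrees F)^2

n = len(readings)
s_sq = variance(readings)               # sample variance, divides by n - 1
test_stat = (n - 1) * s_sq / sigma0_sq  # (n - 1) s^2 / 1.52

print(f"s^2 = {s_sq:.4f}, test statistic = {test_stat:.3f}")
```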
Part (c): 


The test statistic has a $\chi^2$ distribution with 9 degrees of freedom under $H_0$. The chance of exceeding the 
observed value of 12.069, under $H_0$, is 


p-value = $P(\chi^2_9 \ge 12.069) = 0.2094$. 


(or, from the table, .20 < p-value < .25). Since the p-value is greater than 0.05, we cannot reject the null 
hypothesis. That is, we do not have statistically significant evidence that recent thermostats are less reliable 
(more variable) than in the past. 
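The p-value in part (c) can be reproduced without a table. The sketch below uses only the standard library and the closed-form upper-tail formula for a chi-square variable with an odd number of degrees of freedom; the function name `chi2_upper_tail_odd_df` is illustrative, and a statistics package would normally be used instead.

```python
import math

def chi2_upper_tail_odd_df(x, df):
    """P(X >= x) for a chi-square variable X with odd degrees of freedom df."""
    if df < 1 or df % 2 != 1:
        raise ValueError("this sketch only handles odd degrees of freedom")
    if x <= 0:
        return 1.0
    root = math.sqrt(x)
    normal_pdf = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    total = 0.0
    term = root  # first term: root^1 / 1
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)  # next term: root^(2r+1) / (1*3*...*(2r+1))
    # erfc(root / sqrt(2)) equals twice the standard normal upper tail at root.
    return math.erfc(root / math.sqrt(2)) + 2 * normal_pdf * total

p_value = chi2_upper_tail_odd_df(12.069, 9)
print(f"p-value = {p_value:.4f}")
```

Since the p-value exceeds 0.05, the script agrees with the decision not to reject the null hypothesis.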


Part (d): 


The smallest value that would have led to the rejection of the null hypothesis is the 95th percentile of the 
$\chi^2$ distribution with 9 degrees of freedom, which is 16.92. The rejection region contains all values greater 
than or equal to 16.92. This region should be identified on the graph by indicating the approximate location 
of 16.92 on the axis and shading the region that is bounded by the vertical line through 16.92, the horizontal 
axis, and the $\chi^2$ curve. 
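The critical value in part (d) is the point where the upper-tail probability equals 0.05. As a rough check, it can be located by bisection on the upper-tail function. The helper names and the search bounds (0 to 100) are assumptions made for this sketch only.

```python
import math

def chi2_upper_tail_odd_df(x, df):
    """P(X >= x) for a chi-square variable X with odd degrees of freedom df."""
    if x <= 0:
        return 1.0
    root = math.sqrt(x)
    normal_pdf = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    total, term = 0.0, root
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return math.erfc(root / math.sqrt(2)) + 2 * normal_pdf * total

def critical_value(alpha, df, lo=0.0, hi=100.0, tol=1e-8):
    """Find c with P(X >= c) = alpha by bisection (the tail decreases in c)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_upper_tail_odd_df(mid, df) > alpha:
            lo = mid  # tail still too large, move right
        else:
            hi = mid
    return (lo + hi) / 2

crit = critical_value(0.05, 9)
print(f"critical value = {crit:.2f}")
```

Any observed test statistic at or above this value falls in the rejection region; the observed 12.069 does not.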


Part (e) 


Indicate the region to the right of 16.92 on all three histograms. 


Part (f) 


The population with the largest variance will tend to produce the largest values of $s^2$ in the simulation and 
hence the largest test statistics. Histogram III has the largest probability of producing a sample that would 
lead to the rejection of $H_0$, so Histogram III corresponds to the population with the largest variance. 


Similarly, the test statistics will tend to be smallest for the population with variance closest to 1.52. 
Histogram II has the smallest probability of producing a sample that would lead to the rejection of $H_0$, so 
Histogram II corresponds to the population with the smallest variance. 
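The reasoning in part (f) can be illustrated with a small simulation in the spirit of part (e). The population variances used below (2, 3, and 5) are hypothetical: the question does not state the variances behind Histograms I, II, and III. The function name and the fixed seed are also assumptions of this sketch.

```python
import random
from statistics import variance

def rejection_rate(pop_variance, n_samples=1000, n=10,
                   sigma0_sq=1.52, critical=16.92, seed=0):
    """Fraction of simulated samples whose test statistic is >= critical."""
    rng = random.Random(seed)
    sd = pop_variance ** 0.5
    rejections = 0
    for _ in range(n_samples):
        # Draw n readings from a normal population with mean 68.
        sample = [rng.gauss(68, sd) for _ in range(n)]
        stat = (n - 1) * variance(sample) / sigma0_sq
        if stat >= critical:
            rejections += 1
    return rejections / n_samples

# Hypothetical population variances, all greater than 1.52.
rates = {v: rejection_rate(v) for v in (2.0, 3.0, 5.0)}
for v, rate in rates.items():
    print(f"population variance {v}: rejection rate {rate:.3f}")
```

Under these assumptions the share of simulated test statistics to the right of 16.92 grows with the population variance, which is the link between the marked region and the variance that the justification needs to make.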


Scoring 


Each of the four components is scored as essentially correct (E), partially correct (P), or incorrect (I). 


I. Parts (a) and (b) are combined into one component and scored as essentially correct (E) if both part (a) and 
part (b) are correct. 


Parts (a) and (b) are partially correct (P) if one of the two parts is correct. 

Notes: 

1. If a two-sided alternative is used or the hypotheses involve a mean, then part (a) is not correct. 
2. Nonstandard notation for the population variance must be defined. 
3. If the value of $s$ (or of $s^2$) is not shown in part (b), then part (b) is incorrect. 


II. Part (c) is scored as essentially correct (E) if both: 
• The p-value is given (or the test statistic compared to the critical value). 
• The conclusion is written in context and linked to the p-value. 

Part (c) is partially correct (P) if one of the two bulleted items is correct. 


Notes: 


1. Conditions (SRS, normal population) are given in the problem so it is not necessary to restate them. 
However, if incorrect conditions are given, the first bullet is incorrect. 
2. If the null hypothesis is “accepted” or equivalent, the second bullet is incorrect. 
3. If both an $\alpha$ and a p-value are given, the linkage is implied. If no $\alpha$ is given, the solution must be explicit 
about the linkage by giving a correct interpretation of the p-value or explaining how the conclusion 
follows from the p-value. 


III. Parts (d) and (e) are combined into one component and scored as essentially correct (E) if both: 
• The critical value is identified as 16.92. 
• The region to the right of a cut-off point of between 15 and 20 is identified in part (d) AND the same 
region is identified in each of the three histograms in part (e). 


Parts (d) and (e) are partially correct (P) if one of the two bulleted items above is correct. 


IV. Part (f) is essentially correct (E) if both: 


• Histograms III and II are identified as the simulated sampling distributions for the populations with 
the largest and smallest variances, respectively. 
• The justification refers to the fact that Histogram III came from the population with the largest variance 
because the identified region is largest, and so it will be more likely to reject the null hypothesis. 
(Similarly for the smallest variance.) 


Part (f) is partially correct (P) if both: 


• Histograms III and II are identified as the simulated sampling distributions for the populations with 
the largest and smallest variances, respectively. 
• The justification says only that Histogram III represents the population with the largest variance 
because the identified region is largest. OR The justification refers to the fact that the simulated 
sampling distribution for the population with the largest variance should result in sample variances, 
and hence test statistics, that are centered at the largest values. (Similarly for the smallest variance.) 
OR The justification refers to the fact that the simulated sampling distribution for the population with 
the largest variance should result in sample variances, and hence test statistics, that are more 
variable and Histogram III has the more variable values of the test statistic. (Similarly for the 
smallest variance.) 


Part (f) is incorrect (I) if 


• Histograms III and II are identified as the simulated sampling distributions for the populations with 
the largest and smallest variances, respectively, but the justification refers only to the fact that these 
histograms themselves have the largest and smallest variability. 


Note: 


1. If only one of Histogram III or Histogram II is identified and correctly justified, the response is scored 
partially correct. 


For each of the four components, 
Essentially Correct (E) = 1 
Partially Correct (P) = 1/2 


Incorrect (I) = 0 
4 Complete Response 
3 Substantial Response 
2 Developing Response 
1 Minimal Response 


If a response is between two scores (for example, 2½ points), use a holistic approach to determine whether 
to score up or down depending on the strength of the response and communication. 
STATISTICS 
SECTION II 
Part B 
Question 6 
Spend about 25 minutes on this part of the exam. 
Percent of Section II grade—25 


Directions: Show all your work. Indicate clearly the methods you use, because you will be graded on the 
correctness of your methods as well as on the accuracy and completeness of your results and explanations. 


6. A manufacturer of thermostats is concerned that the readings of its thermostats have become less reliable (more 
variable). In the past, the variance has been 1.52 degrees Fahrenheit (F) squared. A random sample of 10 
recently manufactured thermostats was selected and placed in a room that was maintained at 68°F. The readings 
for those 10 thermostats are given in the table below. 


Thermostat       |  1   |  2   |  3   |  4   |  5   |  6   |  7   |  8   |  9   |  10 
Temperature (°F) | 66.8 | 67.8 | 70.6 | 69.3 | 65.9 | 66.2 | 68.1 | 68.6 | 67.9 | 67.2 


(a) State the null and alternative hypotheses that the manufacturer is interested in testing. 
$H_0: \sigma^2 = 1.52$, where $\sigma^2$ is the true population [variance] 





It can be shown that if the population of thermostat temperatures is normally distributed, the sampling 
distribution of $\frac{(n-1)s^2}{\sigma^2}$ follows a chi-square distribution with $n - 1$ degrees of freedom. 


(b) Calculate the value of $\frac{(n-1)s^2}{1.52}$ for these data. 


$\chi^2 = \frac{(n-1)s^2}{1.52} = \frac{(10-1)(1.428)^2}{1.52} = 12.074$ 




(c) Assume that the population of thermostat temperatures follows a normal distribution. Use the test statistic 
$\frac{(n-1)s^2}{1.52}$ from part (b) and the chi-square distribution to test the hypotheses in part (a). 


$\chi^2 = 12.074$ 
$P(\chi^2 \ge 12.074)$, df = 9, = .2092 > $\alpha$ = .05 


We do not reject $H_0$. It is not unlikely to get 
results such as ours, $s^2 = (1.428)^2 = 2.039$, 
given that $\sigma^2 = 1.52$, based on chance alone. 
We do not have strong evidence that, at $\alpha = .05$, 
the reading of thermostats has become less 
reliable. 


(d) For the test conducted in part (c), what is the smallest value of the test statistic that would have led to the 
rejection of the null hypothesis at the 5 percent significance level? 
The smallest test statistic that would have 
led to rejection is 16.92. 


Mark this value of the test statistic on the graph of the chi-square distribution below. Indicate the region that 
contains all of the values that would have led to the rejection of the null hypothesis. 





[Graph: chi-square distribution curve; horizontal axis "Chi-square values", 0 to 35] 
(e) Using simulation, 1,000 samples, each of size 10, were randomly generated from 3 populations with 
different variances. Each population was normally distributed with mean 68 and variance greater than 1.52. 


The histograms below show the simulated sampling distribution of $\frac{(n-1)s^2}{1.52}$ for each population. 


Mark the region identified in part (d) on each of the histograms below. 


[Figure: Histograms I, II, and III of the simulated values of the test statistic; horizontal axis "Simulated Values", 5 to 60; vertical axis "Frequency"] 
(f) Based on the regions that you marked in part (e), identify the simulated sampling distribution that 
corresponds to the population with the largest variance. Then identify the simulated sampling distribution 
that corresponds to the population with the smallest variance. Justify your choices. 





The population that corresponds to the largest variance 
is in Histogram III, because there is the greatest amt of 
area under the curve after 16.92, meaning that 
it would be more likely to choose a sample that 
would correctly reject the null hypothesis that $\sigma^2 = 1.52$. 


The population that corresponds to the smallest 
variance is that of Histogram II. This distribution 
has the least amt of area under the curve after 
16.92. Although we know that the variance is greater 
than 1.52 for this population, it would be the easiest 
[illegible] closest to 1.52. 


THE FOLLOWING INSTRUCTIONS APPLY TO THE COVERS OF THE 
SECTION II BOOKLET. 


• MAKE SURE YOU HAVE COMPLETED THE IDENTIFICATION 
INFORMATION AS REQUESTED ON THE FRONT AND BACK 
COVERS OF THE SECTION II BOOKLET. 


• CHECK TO SEE THAT YOUR AP NUMBER LABEL APPEARS IN 
THE BOX(ES) ON THE COVER(S). 


• MAKE SURE YOU HAVE USED THE SAME SET OF AP 
NUMBER LABELS ON ALL AP EXAMS YOU HAVE TAKEN 
THIS YEAR. 
STATISTICS 
SECTION II 
Part B 
Question 6 
Spend about 25 minutes on this part of the exam. 
Percent of Section II grade—25 


Directions: Show all your work. Indicate clearly the methods you use, because you will be graded on the 
correctness of your methods as well as on the accuracy and completeness of your results and explanations. 


6. A manufacturer of thermostats is concerned that the readings of its thermostats have become less reliable (more 
variable). In the past, the variance has been 1.52 degrees Fahrenheit (F) squared. A random sample of 10 


recently manufactured thermostats was selected and placed in a room that was maintained at 68°F. The readings 
for those 10 thermostats are given in the table below. 






Thermostat       |  1   |  2   |  3   |  4   |  5   |  6   |  7   |  8   |  9   |  10 
Temperature (°F) | 66.8 | 67.8 | 70.6 | 69.3 | 65.9 | 66.2 | 68.1 | 68.6 | 67.9 | 67.2 


(a) State the null and alternative hypotheses that the manufacturer is interested in testing. 
$H_0: \sigma^2 = 1.52$ 
$H_a: \sigma^2 > 1.52$ 










It can be shown that if the population of thermostat temperatures is normally distributed, the sampling 
distribution of $\frac{(n-1)s^2}{\sigma^2}$ follows a chi-square distribution with $n - 1$ degrees of freedom. 


(b) Calculate the value of $\frac{(n-1)s^2}{1.52}$ for these data. 
$s^2 = 1.8344$ [remaining work illegible] 
$n = 10$ 


(c) Assume that the population of thermostat temperatures follows a normal distribution. Use the test statistic 
$\frac{(n-1)s^2}{1.52}$ from part (b) and the chi-square distribution to test the hypotheses in part (a). 


$P > .25$ 
The probability is greater than 
.25 
Do not reject $H_0$. The difference 
is not statistically significant. 


(d) For the test conducted in part (c), what is the smallest value of the test statistic that would have led to the 
rejection of the null hypothesis at the 5 percent significance level? 


16.92 


Mark this value of the test statistic on the graph of the chi-square distribution below. Indicate the region that 
contains all of the values that would have led to the rejection of the null hypothesis. 





[Graph: chi-square distribution curve; horizontal axis "Chi-square values", 0 to 35] 


© 2006 The College Board. All rights reserved. 
Visit apcentral.collegeboard.com (for AP professionals) and www.collegeboard.com/apstudents (for students and parents). 


(e) Using simulation, 1,000 samples, each of size 10, were randomly generated from 3 populations with 
different variances. Each population was normally distributed with mean 68 and variance greater than 1.52. 


The histograms below show the simulated sampling distribution of $\frac{(n-1)s^2}{1.52}$ for each population. 


Mark the region identified in part (d) on each of the histograms below. 


[Figure: Histograms I, II, and III of the simulated values of the test statistic; horizontal axis "Simulated Values", 5 to 60] 
(f) Based on the regions that you marked in part (e), identify the simulated sampling distribution that 
corresponds to the population with the largest variance. Then identify the simulated sampling distribution 
that corresponds to the population with the smallest variance. Justify your choices. 


Histogram III contains the largest 
variance because it is the one with the 
most values that would be 
large enough for me to reject 
the null hypothesis that the 
variance is equal to 1.52. 
Histogram II contains the smallest 
variance because it is the one [illegible] 

STOP 
END OF EXAM 


STATISTICS 
SECTION II 
Part B 
Question 6 
Spend about 25 minutes on this part of the exam. 
Percent of Section II grade—25 


Directions: Show all your work. Indicate clearly the methods you use, because you will be graded on the 
correctness of your methods as well as on the accuracy and completeness of your results and explanations. 


6. A manufacturer of thermostats is concerned that the readings of its thermostats have become less reliable (more 
variable). In the past, the variance has been 1.52 degrees Fahrenheit (F) squared. A random sample of 10 
recently manufactured thermostats was selected and placed in a room that was maintained at 68°F. The readings 
for those 10 thermostats are given in the table below. 







Thermostat       |  1   |  2   |  3   |  4   |  5   |  6   |  7   |  8   |  9   |  10 
Temperature (°F) | 66.8 | 67.8 | 70.6 | 69.3 | 65.9 | 66.2 | 68.1 | 68.6 | 67.9 | 67.2 


(a) State the null and alternative hypotheses that the manufacturer is interested in testing. 


$H_0: \sigma^2 = 1.52$, where $\sigma^2$ = the population variance for thermostats 
$H_a: \sigma^2 > 1.52$ 





It can be shown that if the population of thermostat temperatures is normally distributed, the sampling 
distribution of $\frac{(n-1)s^2}{\sigma^2}$ follows a chi-square distribution with $n - 1$ degrees of freedom. 


(b) Calculate the value of $\frac{(n-1)s^2}{1.52}$ for these data. 


$n = 10$, $s = 1.42766$ 


$\frac{9(1.42766)^2}{1.52} = 12.0684$ 


(c) Assume that the population of thermostat temperatures follows a normal distribution. Use the test statistic 
$\frac{(n-1)s^2}{1.52}$ from part (b) and the chi-square distribution to test the hypotheses in part (a). 


Using 9 degrees of freedom ($n - 1$) and the chi-square 
distribution chart, we can determine that the tail probability 
lies between .20 and .25 (closer to .2). Since the 
p-value of the test statistic is so large, we cannot 
reject the null hypothesis under any reasonable 
level of significance ($\alpha = .05$). 


(d) For the test conducted in part (c), what is the smallest value of the test statistic that would have led to the 
rejection of the null hypothesis at the 5 percent significance level? 


[beginning illegible] lead to the rejection of the null hypothesis would be 16.92. 


Mark this value of the test statistic on the graph of the chi-square distribution below. Indicate the region that 
contains all of the values that would have led to the rejection of the null hypothesis. 





[Graph: chi-square distribution curve; horizontal axis "Chi-square values", 0 to 35; student mark at 16.92] 


(e) Using simulation, 1,000 samples, each of size 10, were randomly generated from 3 populations with 
different variances. Each population was normally distributed with mean 68 and variance greater than 1.52. 


The histograms below show the simulated sampling distribution of $\frac{(n-1)s^2}{1.52}$ for each population. 


Mark the region identified in part (d) on each of the histograms below. 


[Figure: Histograms I, II, and III of the simulated values of the test statistic; horizontal axis "Simulated Values", 5 to 60; vertical axis "Frequency"] 


(f) Based on the regions that you marked in part (e), identify the simulated sampling distribution that 
corresponds to the population with the largest variance. Then identify the simulated sampling distribution 
that corresponds to the population with the smallest variance. Justify your choices. 


[beginning illegible] values deviate much farther [illegible] 


The simulated distribution that corresponds to 
the population with the smallest variance is 
[illegible] because its values do not deviate much and are 
distributed over a narrower [illegible] 


STOP 


END OF EXAM 





AP® STATISTICS 
2006 SCORING COMMENTARY 


Question 6 
Overview 


The primary goals of this question were to evaluate a student’s ability to apply the concepts of significance testing 
to a new setting; in particular to: (1) state hypotheses for a parameter of interest, given a research question; (2) 
evaluate a new test statistic and use the probability distribution associated with that statistic to test the hypotheses 
of interest; (3) identify the values of the test statistic that would lead to rejection of the null hypothesis on a graph; 
and (4) interpret simulated sampling distributions for different populations. 


Sample: 6A 
Score: 4 


This is a complete essay that reflects insight on how to use the newly-presented test statistic to test a single variance. 
There are two population variances in this situation: the variance of the readings of the population of thermostats in 
the past (known to be 1.52 degrees Fahrenheit squared) and the variance of the readings of recently manufactured 
thermostats. The hypotheses in part (a) could be improved by stating which of these populations is meant by the “true 
population.” The conclusion not to reject the null hypothesis in part (c) is linked to the p-value and is written in the 
context of the variability of the thermostat readings. The satisfactory explanation of the p-value in the conclusion 
would be better if it defined “results such as ours”: It is not unlikely to get a sample variance as large as or even 
larger than in this sample, given that $\sigma^2 = 1.52$, based on chance alone. In part (f) the essay nicely describes how 
the shaded regions in part (e) relate to the concept of Type I error. This essay earned a score of 4. 


Sample: 6B 
Score: 3 


The hypotheses in part (a) could be improved by stating which of the variances is represented by the symbol $\sigma^2$. In 
part (b) the sample variance $s^2 = 1.43^2$ should have been used in the computation. The value given, 1.354, is 
computed by dividing by the sample size of 10 rather than by $n - 1 = 9$. In part (c) the p-value follows consistently 
from the (incorrect) test statistic in part (b), but the p-value is not linked to the conclusion. Linkage could have been 
achieved by appealing either to a rejection region or to the strength of the evidence against the null hypothesis, as 
stated below. 


• Select a significance level, say $\alpha = 0.05$, and state that the p-value is larger than $\alpha$. Conclude that we do 
not reject $H_0$, followed by a statement to that effect in context. 


• State that if the variance has remained 1.52, there is a 0.2092 chance of getting a sample variance as large as 
or even larger than the one from the sample. Thus, with a p-value this large, the evidence against the null 
hypothesis is not strong, and so there is no reason to conclude that the variance has increased. 


Further, the conclusion lacks context and should refer to the “increase” rather than the “difference” in variance. The 
conclusion correctly states, “Do not reject $H_0$.” If the conclusion had been written, “Accept $H_0$” (or the equivalent, 
such as stating that the variance of the recently manufactured thermostats is still 1.52), it would have been scored as 
incorrect. Part (f) correctly links the ability to reject the null hypothesis with the largest region but does not make the 
link to why the population with the largest variance will have the largest region. This essay earned a score of 3. 
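
The divisor issue noted for part (b) of Sample 6B can be seen directly by computing the spread of the ten readings both ways. This is a sketch with illustrative variable names, using the readings from the question; the resulting test statistic for the divide-by-n version is computed here and is not quoted in the commentary.

```python
from statistics import pvariance, variance

# Thermostat readings (degrees F) from the table in the question.
readings = [66.8, 67.8, 70.6, 69.3, 65.9, 66.2, 68.1, 68.6, 67.9, 67.2]
n = len(readings)

s_sq = variance(readings)        # divides by n - 1 = 9 (sample variance)
s_sq_by_n = pvariance(readings)  # divides by n = 10

s = s_sq ** 0.5                  # about 1.43, the value that should be used
s_by_n = s_sq_by_n ** 0.5        # about 1.354, the value given in Sample 6B

stat = (n - 1) * s_sq / 1.52
stat_by_n = (n - 1) * s_sq_by_n / 1.52

print(f"divide by n - 1: s = {s:.4f}, test statistic = {stat:.3f}")
print(f"divide by n:     s = {s_by_n:.4f}, test statistic = {stat_by_n:.3f}")
```

Dividing by n understates the sample variance, so the resulting test statistic is smaller than the correct value.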


Sample: 6C 
Score: 2 


The hypotheses in part (a) could be improved by stating which of the two populations (readings of thermostats in the 
past or readings of recently manufactured thermostats) has its variance represented by $\sigma^2$. The conclusion in part (c) 
nicely links the correct p-value to the conclusion but is not written in the context of the situation. In part (e) the 
critical value is marked, but no region (right tail) is indicated. In part (f) there is no evidence of understanding what 
the histograms represent. The justification for selecting Histograms III and II refers only to the spread of those 
histograms themselves. A connection to the populations has not been established, i.e., why the population with the 
variance farthest above 1.52 would result in a sampling distribution of this test statistic with the largest region and 
hence be most likely to reject the incorrect null hypothesis. This essay earned a score of 2. 